Overview

Dataset statistics

Number of variables: 50
Number of observations: 101766
Missing cells: 181168
Missing cells (%): 3.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 192.9 MiB
Average record size in memory: 1.9 KiB
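
The percentage and per-record figures above can be re-derived from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all variables × observations cells and that 1 MiB = 1024 KiB; both assumptions are consistent with the reported values, and the variable names are illustrative only.

```python
# Re-derive the summary figures from the raw counts in the report.
n_variables = 50
n_observations = 101_766
missing_cells = 181_168
total_size_mib = 192.9

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 1024 KiB.
avg_record_kib = total_size_mib * 1024 / n_observations

# The two "Missing" alerts (max_glu_serum and A1Cresult) account for
# every missing cell in the dataset.
missing_by_column = {"max_glu_serum": 96_420, "A1Cresult": 84_748}

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
print(sum(missing_by_column.values()) == missing_cells)
```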

Variable types

Numeric: 13
Categorical: 30
Text: 4
Boolean: 3

Alerts

examide has constant value "False" Constant
citoglipton has constant value "False" Constant
A1Cresult is highly overall correlated with acetohexamide and 3 other fields High correlation
acetohexamide is highly overall correlated with A1Cresult and 1 other fields High correlation
change is highly overall correlated with diabetesMed and 1 other fields High correlation
diabetesMed is highly overall correlated with change and 1 other fields High correlation
encounter_id is highly overall correlated with patient_nbr High correlation
glimepiride-pioglitazone is highly overall correlated with A1Cresult and 1 other fields High correlation
glipizide-metformin is highly overall correlated with max_glu_serum High correlation
insulin is highly overall correlated with change and 1 other fields High correlation
max_glu_serum is highly overall correlated with acetohexamide and 7 other fields High correlation
metformin-pioglitazone is highly overall correlated with A1Cresult and 1 other fields High correlation
metformin-rosiglitazone is highly overall correlated with max_glu_serum High correlation
miglitol is highly overall correlated with max_glu_serum High correlation
patient_nbr is highly overall correlated with encounter_id High correlation
troglitazone is highly overall correlated with A1Cresult and 1 other fields High correlation
weight is highly overall correlated with max_glu_serum High correlation
race is highly imbalanced (55.9%) Imbalance
weight is highly imbalanced (92.0%) Imbalance
metformin is highly imbalanced (59.5%) Imbalance
repaglinide is highly imbalanced (93.9%) Imbalance
nateglinide is highly imbalanced (96.9%) Imbalance
chlorpropamide is highly imbalanced (99.5%) Imbalance
glimepiride is highly imbalanced (84.0%) Imbalance
acetohexamide is highly imbalanced (> 99.9%) Imbalance
glipizide is highly imbalanced (69.2%) Imbalance
glyburide is highly imbalanced (72.3%) Imbalance
tolbutamide is highly imbalanced (99.7%) Imbalance
pioglitazone is highly imbalanced (80.2%) Imbalance
rosiglitazone is highly imbalanced (82.2%) Imbalance
acarbose is highly imbalanced (98.5%) Imbalance
miglitol is highly imbalanced (99.7%) Imbalance
troglitazone is highly imbalanced (> 99.9%) Imbalance
tolazamide is highly imbalanced (99.7%) Imbalance
glyburide-metformin is highly imbalanced (97.0%) Imbalance
glipizide-metformin is highly imbalanced (99.8%) Imbalance
glimepiride-pioglitazone is highly imbalanced (> 99.9%) Imbalance
metformin-rosiglitazone is highly imbalanced (> 99.9%) Imbalance
metformin-pioglitazone is highly imbalanced (> 99.9%) Imbalance
max_glu_serum has 96420 (94.7%) missing values Missing
A1Cresult has 84748 (83.3%) missing values Missing
number_emergency is highly skewed (γ1 = 22.85558215) Skewed
encounter_id has unique values Unique
num_procedures has 46652 (45.8%) zeros Zeros
number_outpatient has 85027 (83.6%) zeros Zeros
number_emergency has 90383 (88.8%) zeros Zeros
number_inpatient has 67630 (66.5%) zeros Zeros

Reproduction

Analysis started: 2025-09-12 11:57:09.076671
Analysis finished: 2025-09-12 11:57:49.991952
Duration: 40.92 seconds
Software version: ydata-profiling v4.16.1
Download configuration: config.json
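
The reported duration is simply the difference between the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-09-12 11:57:09.076671")
finished = datetime.fromisoformat("2025-09-12 11:57:49.991952")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```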

Variables

encounter_id
Real number (ℝ)

High correlation  Unique 

Distinct: 101766
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6520165 × 10^8
Minimum: 12522
Maximum: 4.4386722 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 12522
5-th percentile: 27170784
Q1: 84961194
median: 1.5238899 × 10^8
Q3: 2.3027089 × 10^8
95-th percentile: 3.7896284 × 10^8
Maximum: 4.4386722 × 10^8
Range: 4.438547 × 10^8
Interquartile range (IQR): 1.4530969 × 10^8

Descriptive statistics

Standard deviation: 1.026403 × 10^8
Coefficient of variation (CV): 0.62130311
Kurtosis: -0.10207139
Mean: 1.6520165 × 10^8
Median Absolute Deviation (MAD): 70921143
Skewness: 0.69914155
Sum: 1.6811911 × 10^13
Variance: 1.053503 × 10^16
Monotonicity: Not monotonic
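
Several of the statistics above are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below re-derives them from the values reported for encounter_id; because the inputs are already rounded, the results agree with the report only to the displayed precision.

```python
# Values copied from the encounter_id statistics (already rounded in the report).
minimum, maximum = 12_522, 4.4386722e8
q1, q3 = 84_961_194, 2.3027089e8
mean, std = 1.6520165e8, 1.026403e8

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(f"Range:    {value_range:.7g}")
print(f"IQR:      {iqr:.8g}")
print(f"CV:       {cv:.8f}")
print(f"Variance: {variance:.7g}")
```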
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
2278392                1       < 0.1%
190792044              1       < 0.1%
190790070              1       < 0.1%
190789722              1       < 0.1%
190786806              1       < 0.1%
190785018              1       < 0.1%
190781412              1       < 0.1%
190775886              1       < 0.1%
190764504              1       < 0.1%
190760322              1       < 0.1%
Other values (101756)  101756  > 99.9%

Value      Count  Frequency (%)
12522      1      < 0.1%
15738      1      < 0.1%
16680      1      < 0.1%
28236      1      < 0.1%
35754      1      < 0.1%
36900      1      < 0.1%
40926      1      < 0.1%
42570      1      < 0.1%
55842      1      < 0.1%
62256      1      < 0.1%

Value      Count  Frequency (%)
443867222  1      < 0.1%
443857166  1      < 0.1%
443854148  1      < 0.1%
443847782  1      < 0.1%
443847548  1      < 0.1%
443847176  1      < 0.1%
443842778  1      < 0.1%
443842340  1      < 0.1%
443842136  1      < 0.1%
443842070  1      < 0.1%

patient_nbr
Real number (ℝ)

High correlation 

Distinct: 71518
Distinct (%): 70.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54330401
Minimum: 135
Maximum: 1.8950262 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 135
5-th percentile: 1456971.8
Q1: 23413221
median: 45505143
Q3: 87545950
95-th percentile: 1.1148027 × 10^8
Maximum: 1.8950262 × 10^8
Range: 1.8950248 × 10^8
Interquartile range (IQR): 64132729

Descriptive statistics

Standard deviation: 38696359
Coefficient of variation (CV): 0.71224138
Kurtosis: -0.34737204
Mean: 54330401
Median Absolute Deviation (MAD): 32950134
Skewness: 0.47128072
Sum: 5.5289876 × 10^12
Variance: 1.4974082 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
88785891              40      < 0.1%
43140906              28      < 0.1%
1660293               23      < 0.1%
88227540              23      < 0.1%
23199021              23      < 0.1%
23643405              22      < 0.1%
84428613              22      < 0.1%
92709351              21      < 0.1%
88789707              20      < 0.1%
29903877              20      < 0.1%
Other values (71508)  101524  99.8%

Value  Count  Frequency (%)
135    2      < 0.1%
378    1      < 0.1%
729    1      < 0.1%
774    1      < 0.1%
927    1      < 0.1%
1152   5      < 0.1%
1305   1      < 0.1%
1314   3      < 0.1%
1629   1      < 0.1%
2025   1      < 0.1%

Value      Count  Frequency (%)
189502619  1      < 0.1%
189481478  1      < 0.1%
189445127  1      < 0.1%
189365864  1      < 0.1%
189351095  1      < 0.1%
189349430  1      < 0.1%
189332087  1      < 0.1%
189298877  1      < 0.1%
189257846  2      < 0.1%
189215762  1      < 0.1%

race
Categorical

Imbalance 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 MiB

Value            Count
Caucasian        76099
AfricanAmerican  19210
?                2273
Hispanic         2037
Other            1506

Length

Max length: 15
Median length: 9
Mean length: 9.8495077
Min length: 1

Characters and Unicode

Total characters: 1002345
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: Caucasian
3rd row: AfricanAmerican
4th row: Caucasian
5th row: Caucasian

Common Values

Value            Count  Frequency (%)
Caucasian        76099  74.8%
AfricanAmerican  19210  18.9%
?                2273   2.2%
Hispanic         2037   2.0%
Other            1506   1.5%
Asian            641    0.6%
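
The Imbalance alert for race reports a score of 55.9%. That value is consistent with one minus the Shannon entropy of the category frequencies, normalised by the maximum entropy log2(k) for k categories. This formula is an assumption inferred from the reported number, not something stated in the report itself; the function name below is illustrative.

```python
import math

# Counts copied from the Common Values table for race.
race_counts = {
    "Caucasian": 76_099,
    "AfricanAmerican": 19_210,
    "?": 2_273,
    "Hispanic": 2_037,
    "Other": 1_506,
    "Asian": 641,
}

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, approaching 1 for a dominant class.

    Assumed definition; it reproduces the score reported for race.
    """
    counts = [c for c in counts if c > 0]
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare against
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(k)

score = imbalance_score(race_counts.values())
print(f"race imbalance: {100 * score:.1f}%")
```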

Length

Histogram of lengths of the category

Common Values (Plot)

Value            Count  Frequency (%)
caucasian        76099  74.8%
africanamerican  19210  18.9%
(empty)          2273   2.2%
hispanic         2037   2.0%
other            1506   1.5%
asian            641    0.6%

Most occurring characters

Value             Count   Frequency (%)
a                 269395  26.9%
i                 119234  11.9%
n                 117197  11.7%
c                 116556  11.6%
s                 78777   7.9%
C                 76099   7.6%
u                 76099   7.6%
r                 39926   4.0%
A                 39061   3.9%
e                 20716   2.1%
Other values (8)  49285   4.9%

Most occurring categories

Value      Count    Frequency (%)
(unknown)  1002345  100.0%

Most occurring scripts

Value      Count    Frequency (%)
(unknown)  1002345  100.0%

Most occurring blocks

Value      Count    Frequency (%)
(unknown)  1002345  100.0%

gender
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 MiB

Value            Count
Female           54708
Male             47055
Unknown/Invalid  3

Length

Max length: 15
Median length: 6
Mean length: 5.0754967
Min length: 4

Characters and Unicode

Total characters: 516513
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Female
3rd row: Female
4th row: Male
5th row: Male

Common Values

Value            Count  Frequency (%)
Female           54708  53.8%
Male             47055  46.2%
Unknown/Invalid  3      < 0.1%
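
The Length and Characters statistics for gender follow directly from this value table: the total character count is the sum of each value's length times its count, and the mean length is that total divided by the number of rows. A short check, using only the counts shown above:

```python
# Counts copied from the Common Values table for gender.
gender_counts = {"Female": 54_708, "Male": 47_055, "Unknown/Invalid": 3}

n_rows = sum(gender_counts.values())
total_chars = sum(len(value) * count for value, count in gender_counts.items())
mean_length = total_chars / n_rows

print(f"Rows:             {n_rows}")
print(f"Total characters: {total_chars}")
print(f"Mean length:      {mean_length:.7f}")
print(f"Max length:       {max(len(v) for v in gender_counts)}")
```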

Length

Histogram of lengths of the category

Common Values (Plot)

Value            Count  Frequency (%)
female           54708  53.8%
male             47055  46.2%
unknown/invalid  3      < 0.1%

Most occurring characters

Value             Count   Frequency (%)
e                 156471  30.3%
a                 101766  19.7%
l                 101766  19.7%
F                 54708   10.6%
m                 54708   10.6%
M                 47055   9.1%
n                 12      < 0.1%
U                 3       < 0.1%
k                 3       < 0.1%
o                 3       < 0.1%
Other values (6)  18      < 0.1%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  516513  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  516513  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  516513  100.0%

age
Categorical

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 MiB

Value             Count
[70-80)           26068
[60-70)           22483
[50-60)           17256
[80-90)           17197
[40-50)           9685
Other values (5)  9077

Length

Max length: 8
Median length: 7
Mean length: 7.0258633
Min length: 6

Characters and Unicode

Total characters: 714994
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [0-10)
2nd row: [10-20)
3rd row: [20-30)
4th row: [30-40)
5th row: [40-50)

Common Values

Value     Count  Frequency (%)
[70-80)   26068  25.6%
[60-70)   22483  22.1%
[50-60)   17256  17.0%
[80-90)   17197  16.9%
[40-50)   9685   9.5%
[30-40)   3775   3.7%
[90-100)  2793   2.7%
[20-30)   1657   1.6%
[10-20)   691    0.7%
[0-10)    161    0.2%
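
The age values are half-open interval labels stored as strings, so their alphabetical order differs from their numeric order. The sketch below shows one way such labels could be parsed and sorted by lower bound; the helper name `parse_interval` is hypothetical and not part of the profiling tool.

```python
import re

# Matches labels such as "[70-80)": a closed lower bound and an open upper bound.
_INTERVAL = re.compile(r"\[(\d+)-(\d+)\)")

def parse_interval(label):
    """Return (lower, upper) for a label like '[70-80)', or None if it does not match."""
    match = _INTERVAL.fullmatch(label)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Counts copied from the Common Values table for age.
age_counts = {
    "[70-80)": 26_068, "[60-70)": 22_483, "[50-60)": 17_256,
    "[80-90)": 17_197, "[40-50)": 9_685, "[30-40)": 3_775,
    "[90-100)": 2_793, "[20-30)": 1_657, "[10-20)": 691, "[0-10)": 161,
}

# Sort the labels numerically by lower bound rather than alphabetically.
ordered = sorted(age_counts, key=lambda label: parse_interval(label)[0])
for label in ordered:
    print(label, age_counts[label])
```

Labels that do not follow the interval pattern, such as the `?` and `>200` values seen in weight, return None and would need separate handling.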

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
70-80   26068  25.6%
60-70   22483  22.1%
50-60   17256  17.0%
80-90   17197  16.9%
40-50   9685   9.5%
30-40   3775   3.7%
90-100  2793   2.7%
20-30   1657   1.6%
10-20   691    0.7%
0-10    161    0.2%

Most occurring characters

Value             Count   Frequency (%)
0                 206325  28.9%
[                 101766  14.2%
-                 101766  14.2%
)                 101766  14.2%
7                 48551   6.8%
8                 43265   6.1%
6                 39739   5.6%
5                 26941   3.8%
9                 19990   2.8%
4                 13460   1.9%
Other values (3)  11425   1.6%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  714994  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  714994  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  714994  100.0%

weight
Categorical

High correlation  Imbalance 

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value             Count
?                 98569
[75-100)          1336
[50-75)           897
[100-125)         625
[125-150)         145
Other values (5)  194

Length

Max length: 9
Median length: 1
Mean length: 1.2170961
Min length: 1

Characters and Unicode

Total characters: 123859
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value      Count  Frequency (%)
?          98569  96.9%
[75-100)   1336   1.3%
[50-75)    897    0.9%
[100-125)  625    0.6%
[125-150)  145    0.1%
[25-50)    97     0.1%
[0-25)     48     < 0.1%
[150-175)  35     < 0.1%
[175-200)  11     < 0.1%
>200       3      < 0.1%
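
The report lists weight as having 0 missing values, yet 96.9% of its rows hold the placeholder `?`. The same placeholder appears in race and payer_code. The sketch below computes the share of rows that would count as missing if `?` were treated as a missing-value marker; that treatment is an assumption about the data, and the helper name `normalise_missing` is hypothetical.

```python
# Row count and "?" counts copied from the report.
n_rows = 101_766
placeholder_counts = {"race": 2_273, "weight": 98_569, "payer_code": 40_256}

def normalise_missing(value, placeholders=("?",)):
    """Map placeholder strings to None so they can be counted as missing."""
    return None if value in placeholders else value

# Share of rows that would be missing if "?" were treated as a missing marker.
effective_missing_pct = {
    name: round(100 * count / n_rows, 1)
    for name, count in placeholder_counts.items()
}
print(effective_missing_pct)

# Applying the helper to a few raw values.
sample = ["?", "[75-100)", "?", "[50-75)"]
print([normalise_missing(v) for v in sample])
```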

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
(empty)  98569  96.9%
75-100   1336   1.3%
50-75    897    0.9%
100-125  625    0.6%
125-150  145    0.1%
25-50    97     0.1%
0-25     48     < 0.1%
150-175  35     < 0.1%
175-200  11     < 0.1%
200      3      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
?      98569  79.6%
0      5172   4.2%
5      4368   3.5%
[      3194   2.6%
-      3194   2.6%
)      3194   2.6%
1      2957   2.4%
7      2279   1.8%
2      929    0.8%
>      3      < 0.1%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  123859  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  123859  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  123859  100.0%

admission_type_id
Real number (ℝ)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0240061
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4454028
Coefficient of variation (CV): 0.7141297
Kurtosis: 1.9424761
Mean: 2.0240061
Median Absolute Deviation (MAD): 0
Skewness: 1.5919843
Sum: 205975
Variance: 2.0891893
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value  Count  Frequency (%)
1      53990  53.1%
3      18869  18.5%
2      18480  18.2%
6      5291   5.2%
5      4785   4.7%
8      320    0.3%
7      21     < 0.1%
4      10     < 0.1%

Value  Count  Frequency (%)
1      53990  53.1%
2      18480  18.2%
3      18869  18.5%
4      10     < 0.1%
5      4785   4.7%
6      5291   5.2%
7      21     < 0.1%
8      320    0.3%

Value  Count  Frequency (%)
8      320    0.3%
7      21     < 0.1%
6      5291   5.2%
5      4785   4.7%
4      10     < 0.1%
3      18869  18.5%
2      18480  18.2%
1      53990  53.1%
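
Because the frequency table for admission_type_id is complete (all 8 distinct values are listed), the Sum, Mean, and Variance in the descriptive statistics can be recomputed from it. The sketch below does so; it uses the n − 1 denominator for the variance, which is an assumption that matches the reported value to the displayed precision.

```python
# Complete frequency table copied from the report: value -> count.
admission_type_counts = {
    1: 53_990, 2: 18_480, 3: 18_869, 4: 10,
    5: 4_785, 6: 5_291, 7: 21, 8: 320,
}

n = sum(admission_type_counts.values())
total = sum(value * count for value, count in admission_type_counts.items())
mean = total / n

# Sample variance (n - 1 denominator), computed from the weighted squared deviations.
squared_dev = sum(
    count * (value - mean) ** 2 for value, count in admission_type_counts.items()
)
variance = squared_dev / (n - 1)

print(f"n = {n}, sum = {total}")
print(f"mean = {mean:.7f}")
print(f"variance = {variance:.7f}")
```

Note that these IDs are category codes, so the mean and variance describe the codes themselves rather than any measured quantity.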

discharge_disposition_id
Real number (ℝ)

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7156418
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 4
95-th percentile: 18
Maximum: 28
Range: 27
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 5.2801655
Coefficient of variation (CV): 1.4210642
Kurtosis: 6.0033468
Mean: 3.7156418
Median Absolute Deviation (MAD): 0
Skewness: 2.563067
Sum: 378126
Variance: 27.880148
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value              Count  Frequency (%)
1                  60234  59.2%
3                  13954  13.7%
6                  12902  12.7%
18                 3691   3.6%
2                  2128   2.1%
22                 1993   2.0%
11                 1642   1.6%
5                  1184   1.2%
25                 989    1.0%
4                  815    0.8%
Other values (16)  2234   2.2%

Value  Count  Frequency (%)
1      60234  59.2%
2      2128   2.1%
3      13954  13.7%
4      815    0.8%
5      1184   1.2%
6      12902  12.7%
7      623    0.6%
8      108    0.1%
9      21     < 0.1%
10     6      < 0.1%

Value  Count  Frequency (%)
28     139    0.1%
27     5      < 0.1%
25     989    1.0%
24     48     < 0.1%
23     412    0.4%
22     1993   2.0%
20     2      < 0.1%
19     8      < 0.1%
18     3691   3.6%
17     14     < 0.1%

admission_source_id
Real number (ℝ)

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.7544366
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 7
Q3: 7
95-th percentile: 17
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.0640808
Coefficient of variation (CV): 0.70625173
Kurtosis: 1.7449894
Mean: 5.7544366
Median Absolute Deviation (MAD): 0
Skewness: 1.0299349
Sum: 585606
Variance: 16.516753
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Value             Count  Frequency (%)
7                 57494  56.5%
1                 29565  29.1%
17                6781   6.7%
4                 3187   3.1%
6                 2264   2.2%
2                 1104   1.1%
5                 855    0.8%
3                 187    0.2%
20                161    0.2%
9                 125    0.1%
Other values (7)  43     < 0.1%

Value  Count  Frequency (%)
1      29565  29.1%
2      1104   1.1%
3      187    0.2%
4      3187   3.1%
5      855    0.8%
6      2264   2.2%
7      57494  56.5%
8      16     < 0.1%
9      125    0.1%
10     8      < 0.1%

Value  Count  Frequency (%)
25     2      < 0.1%
22     12     < 0.1%
20     161    0.2%
17     6781   6.7%
14     2      < 0.1%
13     1      < 0.1%
11     2      < 0.1%
10     8      < 0.1%
9      125    0.1%
8      16     < 0.1%

time_in_hospital
Real number (ℝ)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3959869
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.9851078
Coefficient of variation (CV): 0.67905293
Kurtosis: 0.85025084
Mean: 4.3959869
Median Absolute Deviation (MAD): 2
Skewness: 1.1339987
Sum: 447362
Variance: 8.9108684
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Value             Count  Frequency (%)
3                 17756  17.4%
2                 17224  16.9%
1                 14208  14.0%
4                 13924  13.7%
5                 9966   9.8%
6                 7539   7.4%
7                 5859   5.8%
8                 4391   4.3%
9                 3002   2.9%
10                2342   2.3%
Other values (4)  5555   5.5%

Value  Count  Frequency (%)
1      14208  14.0%
2      17224  16.9%
3      17756  17.4%
4      13924  13.7%
5      9966   9.8%
6      7539   7.4%
7      5859   5.8%
8      4391   4.3%
9      3002   2.9%
10     2342   2.3%

Value  Count  Frequency (%)
14     1042   1.0%
13     1210   1.2%
12     1448   1.4%
11     1855   1.8%
10     2342   2.3%
9      3002   2.9%
8      4391   4.3%
7      5859   5.8%
6      7539   7.4%
5      9966   9.8%
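
The "Other values (4)" row in the first table is the remainder after the ten most frequent values are listed. Since the three tables together give the counts for all 14 distinct values of time_in_hospital, that row can be reproduced. A sketch using `collections.Counter`:

```python
from collections import Counter

# Counts for all 14 distinct values, assembled from the three tables above.
time_in_hospital = Counter({
    1: 14_208, 2: 17_224, 3: 17_756, 4: 13_924, 5: 9_966,
    6: 7_539, 7: 5_859, 8: 4_391, 9: 3_002, 10: 2_342,
    11: 1_855, 12: 1_448, 13: 1_210, 14: 1_042,
})

total = sum(time_in_hospital.values())
top10 = time_in_hospital.most_common(10)

# Everything outside the ten most frequent values is folded into one row.
top_values = {value for value, _ in top10}
other_values = [v for v in time_in_hospital if v not in top_values]
other_count = total - sum(count for _, count in top10)

for value, count in top10:
    print(value, count, f"{100 * count / total:.1f}%")
print(f"Other values ({len(other_values)})", other_count,
      f"{100 * other_count / total:.1f}%")
```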

payer_code
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value              Count
?                  40256
MC                 32439
HM                 6274
SP                 5007
BC                 4655
Other values (13)  13135

Length

Max length: 2
Median length: 2
Mean length: 1.6044258
Min length: 1

Characters and Unicode

Total characters: 163276
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: ?
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Common Values

Value             Count  Frequency (%)
?                 40256  39.6%
MC                32439  31.9%
HM                6274   6.2%
SP                5007   4.9%
BC                4655   4.6%
MD                3532   3.5%
CP                2533   2.5%
UN                2448   2.4%
CM                1937   1.9%
OG                1033   1.0%
Other values (8)  1652   1.6%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
(empty)           40256  39.6%
mc                32439  31.9%
hm                6274   6.2%
sp                5007   4.9%
bc                4655   4.6%
md                3532   3.5%
cp                2533   2.5%
un                2448   2.4%
cm                1937   1.9%
og                1033   1.0%
Other values (8)  1652   1.6%

Most occurring characters

Value             Count  Frequency (%)
M                 44810  27.4%
C                 41845  25.6%
?                 40256  24.7%
P                 8211   5.0%
H                 6420   3.9%
S                 5062   3.1%
B                 4655   2.9%
D                 4081   2.5%
N                 2448   1.5%
U                 2448   1.5%
Other values (7)  3040   1.9%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  163276  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  163276  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  163276  100.0%

medical_specialty
Text

Distinct: 73
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 MiB

Length

Max length: 36
Median length: 33
Mean length: 8.6126702
Min length: 1

Characters and Unicode

Total characters: 876477
Distinct characters: 44
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Pediatrics-Endocrinology
2nd row: ?
3rd row: ?
4th row: ?
5th row: ?

Value                       Count  Frequency (%)
(empty)                     49949  49.1%
internalmedicine            14635  14.4%
emergency/trauma            7565   7.4%
family/generalpractice      7440   7.3%
cardiology                  5352   5.3%
surgery-general             3099   3.0%
nephrology                  1613   1.6%
orthopedics                 1400   1.4%
orthopedics-reconstructive  1233   1.2%
radiologist                 1140   1.1%
Other values (63)           8340   8.2%

Most occurring characters

Value              Count   Frequency (%)
e                  105151  12.0%
r                  76899   8.8%
a                  71149   8.1%
n                  68798   7.8%
i                  63308   7.2%
c                  50007   5.7%
?                  49949   5.7%
l                  48871   5.6%
y                  34937   4.0%
t                  34149   3.9%
Other values (34)  273259  31.2%

Most occurring categories

Value      Count   Frequency (%)
(unknown)  876477  100.0%

Most occurring scripts

Value      Count   Frequency (%)
(unknown)  876477  100.0%

Most occurring blocks

Value      Count   Frequency (%)
(unknown)  876477  100.0%

num_lab_procedures
Real number (ℝ)

Distinct: 118
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.095641
Minimum: 1
Maximum: 132
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 31
median: 44
Q3: 57
95-th percentile: 73
Maximum: 132
Range: 131
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.674362
Coefficient of variation (CV): 0.45652789
Kurtosis: -0.24507352
Mean: 43.095641
Median Absolute Deviation (MAD): 13
Skewness: -0.23654392
Sum: 4385671
Variance: 387.08053
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   3208   3.2%
43                  2804   2.8%
44                  2496   2.5%
45                  2376   2.3%
38                  2213   2.2%
40                  2201   2.2%
46                  2189   2.2%
41                  2117   2.1%
42                  2113   2.1%
47                  2106   2.1%
Other values (108)  77943  76.6%

Value  Count  Frequency (%)
1      3208   3.2%
2      1101   1.1%
3      668    0.7%
4      378    0.4%
5      286    0.3%
6      282    0.3%
7      323    0.3%
8      366    0.4%
9      933    0.9%
10     838    0.8%

Value  Count  Frequency (%)
132    1      < 0.1%
129    1      < 0.1%
126    1      < 0.1%
121    1      < 0.1%
120    1      < 0.1%
118    1      < 0.1%
114    2      < 0.1%
113    3      < 0.1%
111    3      < 0.1%
109    4      < 0.1%

num_procedures
Real number (ℝ)

Zeros 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3397304
Minimum: 0
Maximum: 6
Zeros: 46652
Zeros (%): 45.8%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.705807
Coefficient of variation (CV): 1.2732465
Kurtosis: 0.8571103
Mean: 1.3397304
Median Absolute Deviation (MAD): 1
Skewness: 1.3164148
Sum: 136339
Variance: 2.9097775
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Value  Count  Frequency (%)
0      46652  45.8%
1      20742  20.4%
2      12717  12.5%
3      9443   9.3%
6      4954   4.9%
4      4180   4.1%
5      3078   3.0%

Value  Count  Frequency (%)
0      46652  45.8%
1      20742  20.4%
2      12717  12.5%
3      9443   9.3%
4      4180   4.1%
5      3078   3.0%
6      4954   4.9%

Value  Count  Frequency (%)
6      4954   4.9%
5      3078   3.0%
4      4180   4.1%
3      9443   9.3%
2      12717  12.5%
1      20742  20.4%
0      46652  45.8%

num_medications
Real number (ℝ)

Distinct: 75
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.021844
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 10
median: 15
Q3: 20
95-th percentile: 31
Maximum: 81
Range: 80
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 8.1275662
Coefficient of variation (CV): 0.50728032
Kurtosis: 3.4681549
Mean: 16.021844
Median Absolute Deviation (MAD): 5
Skewness: 1.3266721
Sum: 1630479
Variance: 66.057332
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
13                 6086   6.0%
12                 6004   5.9%
11                 5795   5.7%
15                 5792   5.7%
14                 5707   5.6%
16                 5430   5.3%
10                 5346   5.3%
17                 4919   4.8%
9                  4913   4.8%
18                 4523   4.4%
Other values (65)  47251  46.4%

Value  Count  Frequency (%)
1      262    0.3%
2      470    0.5%
3      900    0.9%
4      1417   1.4%
5      2017   2.0%
6      2699   2.7%
7      3484   3.4%
8      4353   4.3%
9      4913   4.8%
10     5346   5.3%

Value  Count  Frequency (%)
81     1      < 0.1%
79     1      < 0.1%
75     2      < 0.1%
74     1      < 0.1%
72     3      < 0.1%
70     2      < 0.1%
69     5      < 0.1%
68     7      < 0.1%
67     7      < 0.1%
66     5      < 0.1%

number_outpatient
Real number (ℝ)

Zeros 

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36935715
Minimum: 0
Maximum: 42
Zeros: 85027
Zeros (%): 83.6%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.2672651
Coefficient of variation (CV): 3.4310019
Kurtosis: 147.90774
Mean: 0.36935715
Median Absolute Deviation (MAD): 0
Skewness: 8.8329589
Sum: 37588
Variance: 1.6059608
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Value              Count  Frequency (%)
0                  85027  83.6%
1                  8547   8.4%
2                  3594   3.5%
3                  2042   2.0%
4                  1099   1.1%
5                  533    0.5%
6                  303    0.3%
7                  155    0.2%
8                  98     0.1%
9                  83     0.1%
Other values (29)  285    0.3%

Value  Count  Frequency (%)
0      85027  83.6%
1      8547   8.4%
2      3594   3.5%
3      2042   2.0%
4      1099   1.1%
5      533    0.5%
6      303    0.3%
7      155    0.2%
8      98     0.1%
9      83     0.1%

Value  Count  Frequency (%)
42     1      < 0.1%
40     1      < 0.1%
39     1      < 0.1%
38     1      < 0.1%
37     1      < 0.1%
36     2      < 0.1%
35     2      < 0.1%
34     1      < 0.1%
33     2      < 0.1%
29     2      < 0.1%

number_emergency
Real number (ℝ)

Skewed  Zeros 

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.19783621
Minimum: 0
Maximum: 76
Zeros: 90383
Zeros (%): 88.8%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 76
Range: 76
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.93047227
Coefficient of variation (CV): 4.7032455
Kurtosis: 1191.6867
Mean: 0.19783621
Median Absolute Deviation (MAD): 0
Skewness: 22.855582
Sum: 20133
Variance: 0.86577864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
ValueCountFrequency (%)
0 90383
88.8%
1 7677
 
7.5%
2 2042
 
2.0%
3 725
 
0.7%
4 374
 
0.4%
5 192
 
0.2%
6 94
 
0.1%
7 73
 
0.1%
8 50
 
< 0.1%
10 34
 
< 0.1%
Other values (23) 122
 
0.1%
ValueCountFrequency (%)
0 90383
88.8%
1 7677
 
7.5%
2 2042
 
2.0%
3 725
 
0.7%
4 374
 
0.4%
5 192
 
0.2%
6 94
 
0.1%
7 73
 
0.1%
8 50
 
< 0.1%
9 33
 
< 0.1%
ValueCountFrequency (%)
76 1
< 0.1%
64 1
< 0.1%
63 1
< 0.1%
54 1
< 0.1%
46 1
< 0.1%
42 1
< 0.1%
37 1
< 0.1%
29 1
< 0.1%
28 1
< 0.1%
25 2
< 0.1%
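
The descriptive statistics reported for number_emergency are linked by simple identities: mean = sum / n, variance = (standard deviation)^2, and CV = standard deviation / mean. The sketch below checks those identities against the printed figures. It assumes n is the 101766 observations given in the report overview; the variable names are illustrative and not part of the report.

```python
import math

# Figures copied from the number_emergency section of this report.
n_obs = 101766               # number of observations (report overview)
total = 20133                # Sum
std = 0.93047227             # Standard deviation
reported_mean = 0.19783621   # Mean
reported_cv = 4.7032455      # Coefficient of variation (CV)
reported_var = 0.86577864    # Variance

mean = total / n_obs         # mean = sum / n
cv = std / mean              # coefficient of variation = std / mean
var = std ** 2               # variance = std squared

# Compare with a relative tolerance, since the report prints rounded values.
checks = {
    "mean": math.isclose(mean, reported_mean, rel_tol=1e-6),
    "cv": math.isclose(cv, reported_cv, rel_tol=1e-5),
    "variance": math.isclose(var, reported_var, rel_tol=1e-6),
}
print(checks)
```

A CV well above 1, as here, means the spread is several times the mean, which fits a count column where 88.8% of the rows are zero.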

number_inpatient
Real number (ℝ)

Zeros 

Distinct: 21
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.63556591
Minimum: 0
Maximum: 21
Zeros: 67630
Zeros (%): 66.5%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2628633
Coefficient of variation (CV): 1.9869903
Kurtosis: 20.719397
Mean: 0.63556591
Median Absolute Deviation (MAD): 0
Skewness: 3.614139
Sum: 64679
Variance: 1.5948237
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=21)

Common values

Value               Count   Frequency (%)
0                   67630   66.5%
1                   19521   19.2%
2                   7566    7.4%
3                   3411    3.4%
4                   1622    1.6%
5                   812     0.8%
6                   480     0.5%
7                   268     0.3%
8                   151     0.1%
9                   111     0.1%
Other values (11)   194     0.2%

Minimum 10 values

Value               Count   Frequency (%)
0                   67630   66.5%
1                   19521   19.2%
2                   7566    7.4%
3                   3411    3.4%
4                   1622    1.6%
5                   812     0.8%
6                   480     0.5%
7                   268     0.3%
8                   151     0.1%
9                   111     0.1%

Maximum 10 values

Value               Count   Frequency (%)
21                  1       < 0.1%
19                  2       < 0.1%
18                  1       < 0.1%
17                  1       < 0.1%
16                  6       < 0.1%
15                  9       < 0.1%
14                  10      < 0.1%
13                  20      < 0.1%
12                  34      < 0.1%
11                  49      < 0.1%

diag_1
Text

Distinct: 717
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 MiB

Length

Max length: 6
Median length: 3
Mean length: 3.1752157
Min length: 1

Characters and Unicode

Total characters: 323129
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 82
Unique (%): 0.1%

Sample

1st row: 250.83
2nd row: 276
3rd row: 648
4th row: 8
5th row: 197

Value               Count   Frequency (%)
428                 6862    6.7%
414                 6581    6.5%
786                 4016    3.9%
410                 3614    3.6%
486                 3508    3.4%
427                 2766    2.7%
491                 2275    2.2%
715                 2151    2.1%
682                 2042    2.0%
434                 2028    2.0%
Other values (707)  65923   64.8%

Most occurring characters

Value               Count   Frequency (%)
4                   55457   17.2%
2                   39876   12.3%
8                   37949   11.7%
5                   37131   11.5%
7                   28668   8.9%
1                   28106   8.7%
0                   24960   7.7%
6                   23198   7.2%
9                   19978   6.2%
3                   17618   5.5%
Other values (4)    10188   3.2%


diag_2
Text

Distinct: 749
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 MiB

Length

Max length: 6
Median length: 3
Mean length: 3.166195
Min length: 1

Characters and Unicode

Total characters: 322211
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 124
Unique (%): 0.1%

Sample

1st row: ?
2nd row: 250.01
3rd row: 250
4th row: 250.43
5th row: 157

Value               Count   Frequency (%)
276                 6752    6.6%
428                 6662    6.5%
250                 6071    6.0%
427                 5036    4.9%
401                 3736    3.7%
496                 3305    3.2%
599                 3288    3.2%
403                 2823    2.8%
414                 2650    2.6%
411                 2566    2.5%
Other values (739)  58877   57.9%

Most occurring characters

Value               Count   Frequency (%)
4                   51155   15.9%
2                   49765   15.4%
5                   38176   11.8%
0                   34046   10.6%
8                   28711   8.9%
7                   28654   8.9%
1                   26158   8.1%
9                   21842   6.8%
6                   19990   6.2%
3                   14097   4.4%
Other values (4)    9617    3.0%

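
The diag_2 summary reports 0 missing values, yet its first sample row is "?". One plausible reading, which the report itself does not confirm, is that "?" is a placeholder for an unrecorded diagnosis and is being counted as an ordinary string. The sketch below shows how such a placeholder could be normalized before computing a missing share. The token constant and both helper functions are hypothetical names introduced for illustration.

```python
from typing import Optional, Sequence

# Assumption: "?" marks an unrecorded value, as the diag_2 sample suggests.
MISSING_TOKEN = "?"


def normalize_code(raw: str) -> Optional[str]:
    """Return None for the placeholder, otherwise the stripped code string."""
    value = raw.strip()
    return None if value == MISSING_TOKEN else value


def missing_share(values: Sequence[str]) -> float:
    """Fraction of entries that normalize to None (0.0 for an empty input)."""
    if not values:
        return 0.0
    return sum(normalize_code(v) is None for v in values) / len(values)


# The five sample rows shown for diag_2 in this report.
sample = ["?", "250.01", "250", "250.43", "157"]
print([normalize_code(v) for v in sample])
print(missing_share(sample))
```

Codes are kept as strings on purpose: values such as "250.01" and "V27" are identifiers, so converting them to numbers would lose information.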

diag_3
Text

Distinct: 790
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 MiB

Length

Max length: 6
Median length: 3
Mean length: 3.1116581
Min length: 1

Characters and Unicode

Total characters: 316661
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 122
Unique (%): 0.1%

Sample

1st row: ?
2nd row: 255
3rd row: V27
4th row: 403
5th row: 250

Value               Count   Frequency (%)
250                 11555   11.4%
401                 8289    8.1%
276                 5175    5.1%
428                 4577    4.5%
427                 3955    3.9%
414                 3664    3.6%
496                 2605    2.6%
403                 2357    2.3%
585                 1992    2.0%
272                 1969    1.9%
Other values (780)  55628   54.7%

Most occurring characters

Value               Count   Frequency (%)
2                   51244   16.2%
4                   49252   15.6%
5                   41260   13.0%
0                   39711   12.5%
7                   26504   8.4%
1                   24684   7.8%
8                   23825   7.5%
9                   17323   5.5%
6                   16441   5.2%
3                   14333   4.5%
Other values (4)    12084   3.8%


number_diagnoses
Real number (ℝ)

Distinct: 16
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.4226068
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 795.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 6
Median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 15
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9336001
Coefficient of variation (CV): 0.26050149
Kurtosis: -0.079056024
Mean: 7.4226068
Median Absolute Deviation (MAD): 1
Skewness: -0.87674624
Sum: 755369
Variance: 3.7388095
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=16)

Common values

Value               Count   Frequency (%)
9                   49474   48.6%
5                   11393   11.2%
8                   10616   10.4%
7                   10393   10.2%
6                   10161   10.0%
4                   5537    5.4%
3                   2835    2.8%
2                   1023    1.0%
1                   219     0.2%
16                  45      < 0.1%
Other values (6)    70      0.1%

Minimum 10 values

Value               Count   Frequency (%)
1                   219     0.2%
2                   1023    1.0%
3                   2835    2.8%
4                   5537    5.4%
5                   11393   11.2%
6                   10161   10.0%
7                   10393   10.2%
8                   10616   10.4%
9                   49474   48.6%
10                  17      < 0.1%

Maximum 10 values

Value               Count   Frequency (%)
16                  45      < 0.1%
15                  10      < 0.1%
14                  7       < 0.1%
13                  16      < 0.1%
12                  9       < 0.1%
11                  11      < 0.1%
10                  17      < 0.1%
9                   49474   48.6%
8                   10616   10.4%
7                   10393   10.2%

max_glu_serum
Categorical

High correlation  Missing 

Distinct: 3
Distinct (%): 0.1%
Missing: 96420
Missing (%): 94.7%
Memory size: 5.4 MiB

Value               Count
Norm                2597
>200                1485
>300                1264

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 21384
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >300
2nd row: >300
3rd row: Norm
4th row: Norm
5th row: Norm

Common Values

Value               Count   Frequency (%)
Norm                2597    2.6%
>200                1485    1.5%
>300                1264    1.2%
(Missing)           96420   94.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
norm                2597    48.6%
200                 1485    27.8%
300                 1264    23.6%

Most occurring characters

Value               Count   Frequency (%)
0                   5498    25.7%
>                   2749    12.9%
N                   2597    12.1%
o                   2597    12.1%
r                   2597    12.1%
m                   2597    12.1%
2                   1485    6.9%
3                   1264    5.9%

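
The max_glu_serum tables use two different percentage bases: the Common Values table divides by all observations (so Norm is 2.6%), while the word table divides by the non-missing rows only (so norm is 48.6%). A minimal sketch that reproduces both sets of figures from the counts above, assuming 101766 total observations from the report overview; the dictionary names are illustrative.

```python
n_obs = 101766   # total observations (report overview)
missing = 96420  # Missing count for max_glu_serum
counts = {"Norm": 2597, ">200": 1485, ">300": 1264}

# Rows that actually carry a measurement.
observed = sum(counts.values())

# Share of each level relative to all rows and relative to observed rows only.
share_of_all = {k: round(100 * v / n_obs, 1) for k, v in counts.items()}
share_of_observed = {k: round(100 * v / observed, 1) for k, v in counts.items()}
missing_pct = round(100 * missing / n_obs, 1)

print(observed, missing_pct)
print(share_of_all)
print(share_of_observed)
```

With only about 5% of rows observed, any downstream use of this column probably needs to treat "not measured" as its own level rather than imputing it.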

A1Cresult
Categorical

High correlation  Missing 

Distinct: 3
Distinct (%): < 0.1%
Missing: 84748
Missing (%): 83.3%
Memory size: 5.4 MiB

Value               Count
>8                  8216
Norm                4990
>7                  3812

Length

Max length: 4
Median length: 2
Mean length: 2.5864379
Min length: 2

Characters and Unicode

Total characters: 44016
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: >7
2nd row: >7
3rd row: >8
4th row: Norm
5th row: Norm

Common Values

Value               Count   Frequency (%)
>8                  8216    8.1%
Norm                4990    4.9%
>7                  3812    3.7%
(Missing)           84748   83.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
8                   8216    48.3%
norm                4990    29.3%
7                   3812    22.4%

Most occurring characters

Value               Count   Frequency (%)
>                   12028   27.3%
8                   8216    18.7%
N                   4990    11.3%
o                   4990    11.3%
r                   4990    11.3%
m                   4990    11.3%
7                   3812    8.7%


metformin
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  81778
Steady              18346
Up                  1067
Down                575

Length

Max length: 6
Median length: 2
Mean length: 2.7324057
Min length: 2

Characters and Unicode

Total characters: 278066
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  81778   80.4%
Steady              18346   18.0%
Up                  1067    1.0%
Down                575     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  81778   80.4%
steady              18346   18.0%
up                  1067    1.0%
down                575     0.6%

Most occurring characters

Value               Count   Frequency (%)
o                   82353   29.6%
N                   81778   29.4%
S                   18346   6.6%
t                   18346   6.6%
e                   18346   6.6%
a                   18346   6.6%
d                   18346   6.6%
y                   18346   6.6%
U                   1067    0.4%
p                   1067    0.4%
Other values (3)    1725    0.6%

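
The report's alert list flags metformin as highly imbalanced (59.5%). The report does not state how that score is computed. One formula that reproduces the figure from the counts above is one minus the Shannon entropy of the class frequencies, normalized by the logarithm of the number of classes. The sketch below treats that formula as an assumption; `imbalance_score` is a hypothetical helper name.

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """1 - H / log2(k): 0 for a uniform split, approaching 1 as one class dominates.

    Assumed formula; it matches the percentages in this report's alerts
    but is not confirmed by the report itself.
    """
    k = len(counts)
    total = sum(counts)
    if k < 2 or total <= 0:
        raise ValueError("need at least two classes and a positive total")
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# metformin counts from this report: No, Steady, Up, Down.
metformin = [81778, 18346, 1067, 575]
score = imbalance_score(metformin)
print(f"{100 * score:.1f}%")
```

The same function applied to the repaglinide counts (100227, 1384, 110, 45) gives a value that rounds to the 93.9% listed in the alerts, which supports the assumed formula without proving it.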

repaglinide
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  100227
Steady              1384
Up                  110
Down                45

Length

Max length: 6
Median length: 2
Mean length: 2.0552837
Min length: 2

Characters and Unicode

Total characters: 209158
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  100227  98.5%
Steady              1384    1.4%
Up                  110     0.1%
Down                45      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  100227  98.5%
steady              1384    1.4%
up                  110     0.1%
down                45      < 0.1%

Most occurring characters

Value               Count   Frequency (%)
o                   100272  47.9%
N                   100227  47.9%
S                   1384    0.7%
t                   1384    0.7%
e                   1384    0.7%
a                   1384    0.7%
d                   1384    0.7%
y                   1384    0.7%
U                   110     0.1%
p                   110     0.1%
Other values (3)    135     0.1%


nateglinide
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  101063
Steady              668
Up                  24
Down                11

Length

Max length: 6
Median length: 2
Mean length: 2.0264725
Min length: 2

Characters and Unicode

Total characters: 206226
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  101063  99.3%
Steady              668     0.7%
Up                  24      < 0.1%
Down                11      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  101063  99.3%
steady              668     0.7%
up                  24      < 0.1%
down                11      < 0.1%

Most occurring characters

Value               Count   Frequency (%)
o                   101074  49.0%
N                   101063  49.0%
S                   668     0.3%
t                   668     0.3%
e                   668     0.3%
a                   668     0.3%
d                   668     0.3%
y                   668     0.3%
U                   24      < 0.1%
p                   24      < 0.1%
Other values (3)    33      < 0.1%


chlorpropamide
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  101680
Steady              79
Up                  6
Down                1

Length

Max length: 6
Median length: 2
Mean length: 2.0031248
Min length: 2

Characters and Unicode

Total characters: 203850
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  101680  99.9%
Steady              79      0.1%
Up                  6       < 0.1%
Down                1       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  101680  99.9%
steady              79      0.1%
up                  6       < 0.1%
down                1       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
o                   101681  49.9%
N                   101680  49.9%
S                   79      < 0.1%
t                   79      < 0.1%
e                   79      < 0.1%
a                   79      < 0.1%
d                   79      < 0.1%
y                   79      < 0.1%
U                   6       < 0.1%
p                   6       < 0.1%
Other values (3)    3       < 0.1%


glimepiride
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  96575
Steady              4670
Up                  327
Down                194

Length

Max length: 6
Median length: 2
Mean length: 2.187371
Min length: 2

Characters and Unicode

Total characters: 222600
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  96575   94.9%
Steady              4670    4.6%
Up                  327     0.3%
Down                194     0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  96575   94.9%
steady              4670    4.6%
up                  327     0.3%
down                194     0.2%

Most occurring characters

Value               Count   Frequency (%)
o                   96769   43.5%
N                   96575   43.4%
S                   4670    2.1%
t                   4670    2.1%
e                   4670    2.1%
a                   4670    2.1%
d                   4670    2.1%
y                   4670    2.1%
U                   327     0.1%
p                   327     0.1%
Other values (3)    582     0.3%


acetohexamide
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value               Count
No                  101765
Steady              1

Length

Max length: 6
Median length: 2
Mean length: 2.0000393
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  101765  > 99.9%
Steady              1       < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  101765  > 99.9%
steady              1       < 0.1%

Most occurring characters

Value               Count   Frequency (%)
N                   101765  50.0%
o                   101765  50.0%
S                   1       < 0.1%
t                   1       < 0.1%
e                   1       < 0.1%
a                   1       < 0.1%
d                   1       < 0.1%
y                   1       < 0.1%
< 0.1%

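
acetohexamide has two distinct values, but only one row out of 101766 differs from "No", so for most modelling purposes it behaves like a constant column. The sketch below shows one way to flag such columns from their value counts. The 0.999 threshold and the `is_near_constant` helper are assumptions chosen for illustration, not settings taken from the report.

```python
from typing import Mapping


def dominant_share(value_counts: Mapping[str, int]) -> float:
    """Share of rows taken by the most frequent value."""
    total = sum(value_counts.values())
    if total <= 0:
        raise ValueError("value_counts must contain at least one row")
    return max(value_counts.values()) / total


def is_near_constant(value_counts: Mapping[str, int], threshold: float = 0.999) -> bool:
    """True when a single value covers at least `threshold` of the rows (assumed cutoff)."""
    return dominant_share(value_counts) >= threshold


# Counts copied from this report.
acetohexamide = {"No": 101765, "Steady": 1}
metformin = {"No": 81778, "Steady": 18346, "Up": 1067, "Down": 575}

print(is_near_constant(acetohexamide), is_near_constant(metformin))
```

A share-based rule like this catches columns that a strict "one distinct value" check misses, at the cost of having to pick a cutoff.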

glipizide
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  89080
Steady              11356
Up                  770
Down                560

Length

Max length: 6
Median length: 2
Mean length: 2.457363
Min length: 2

Characters and Unicode

Total characters: 250076
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: Steady
4th row: No
5th row: Steady

Common Values

Value               Count   Frequency (%)
No                  89080   87.5%
Steady              11356   11.2%
Up                  770     0.8%
Down                560     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  89080   87.5%
steady              11356   11.2%
up                  770     0.8%
down                560     0.6%

Most occurring characters

Value               Count   Frequency (%)
o                   89640   35.8%
N                   89080   35.6%
S                   11356   4.5%
t                   11356   4.5%
e                   11356   4.5%
a                   11356   4.5%
d                   11356   4.5%
y                   11356   4.5%
U                   770     0.3%
p                   770     0.3%
Other values (3)    1680    0.7%


glyburide
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Value               Count
No                  91116
Steady              9274
Up                  812
Down                564

Length

Max length: 6
Median length: 2
Mean length: 2.3756068
Min length: 2

Characters and Unicode

Total characters: 241756
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  91116   89.5%
Steady              9274    9.1%
Up                  812     0.8%
Down                564     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  91116   89.5%
steady              9274    9.1%
up                  812     0.8%
down                564     0.6%

Most occurring characters

Value               Count   Frequency (%)
o                   91680   37.9%
N                   91116   37.7%
S                   9274    3.8%
t                   9274    3.8%
e                   9274    3.8%
a                   9274    3.8%
d                   9274    3.8%
y                   9274    3.8%
U                   812     0.3%
p                   812     0.3%
Other values (3)    1692    0.7%


tolbutamide
Categorical

Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value               Count
No                  101743
Steady              23

Length

Max length: 6
Median length: 2
Mean length: 2.000904
Min length: 2

Characters and Unicode

Total characters: 203624
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value               Count   Frequency (%)
No                  101743  > 99.9%
Steady              23      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count   Frequency (%)
no                  101743  > 99.9%
steady              23      < 0.1%

Most occurring characters

Value               Count   Frequency (%)
N                   101743  50.0%
o                   101743  50.0%
S                   23      < 0.1%
t                   23      < 0.1%
e                   23      < 0.1%
a                   23      < 0.1%
d                   23      < 0.1%
y                   23      < 0.1%
< 0.1%


pioglitazone
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.2765167
Min length: 2

Characters and Unicode

Total characters: 231672
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      94438  92.8%
Steady  6976   6.9%
Up      234    0.2%
Down    118    0.1%
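
The Imbalance alert on this variable can be related to the counts above. The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the value distribution divided by its maximum (log2 of the number of classes). That definition and the function name `imbalance_score` are assumptions made for illustration; this report does not state how the alert is computed.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class only)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts from the pioglitazone "Common Values" table: No, Steady, Up, Down.
pioglitazone = [94438, 6976, 234, 118]
score = imbalance_score(pioglitazone)
print(f"imbalance score: {score:.3f}")
```

Under this assumed definition, a variable dominated by one value scores close to 1 and an evenly split variable scores 0.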

Most occurring characters

Value             Count  Frequency (%)
o                 94556  40.8%
N                 94438  40.8%
S                 6976   3.0%
t                 6976   3.0%
e                 6976   3.0%
a                 6976   3.0%
d                 6976   3.0%
y                 6976   3.0%
U                 234    0.1%
p                 234    0.1%
Other values (3)  354    0.2%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  231672  100.0%
Script    (unknown)  231672  100.0%
Block     (unknown)  231672  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

rosiglitazone
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.2414755
Min length: 2

Characters and Unicode

Total characters: 228106
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count  Frequency (%)
No      95401  93.7%
Steady  6100   6.0%
Up      178    0.2%
Down    87     0.1%
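
The length and character totals reported for this variable follow directly from the value counts. A short sketch that recomputes them, using only the counts in the table above (the variable names are illustrative):

```python
from statistics import median

# Counts from the rosiglitazone "Common Values" table.
value_counts = {"No": 95401, "Steady": 6100, "Up": 178, "Down": 87}

n_rows = sum(value_counts.values())

# Each row contributes the length of its category label.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

lengths = [len(value) for value in value_counts]
max_length, min_length = max(lengths), min(lengths)

# Expand to one length per row; about 100k integers, fine for a sketch.
median_length = median(
    length for value, count in value_counts.items() for length in [len(value)] * count
)

print(n_rows, total_chars, round(mean_length, 7), median_length, max_length, min_length)
```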

Most occurring characters

Value             Count  Frequency (%)
o                 95488  41.9%
N                 95401  41.8%
S                 6100   2.7%
t                 6100   2.7%
e                 6100   2.7%
a                 6100   2.7%
d                 6100   2.7%
y                 6100   2.7%
U                 178    0.1%
p                 178    0.1%
Other values (3)  261    0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  228106  100.0%
Script    (unknown)  228106  100.0%
Block     (unknown)  228106  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

acarbose
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0116542
Min length: 2

Characters and Unicode

Total characters: 204718
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101458  99.7%
Steady  295     0.3%
Up      10      < 0.1%
Down    3       < 0.1%

Most occurring characters

Value             Count   Frequency (%)
o                 101461  49.6%
N                 101458  49.6%
S                 295     0.1%
t                 295     0.1%
e                 295     0.1%
a                 295     0.1%
d                 295     0.1%
y                 295     0.1%
U                 10      < 0.1%
p                 10      < 0.1%
Other values (3)  9       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  204718  100.0%
Script    (unknown)  204718  100.0%
Block     (unknown)  204718  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

miglitol
Categorical

High correlation  Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0013167
Min length: 2

Characters and Unicode

Total characters: 203666
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101728  > 99.9%
Steady  31      < 0.1%
Down    5       < 0.1%
Up      2       < 0.1%
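
The "> 99.9%" and "< 0.1%" entries are bounded labels rather than rounded values. One rule that reproduces the labels seen in this report is sketched below. The rule and the function name `format_frequency` are assumptions for illustration, not something the report documents.

```python
def format_frequency(count, total):
    """Format count/total as a percentage, flagging shares that would round to 0 or 100."""
    share = count / total
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    if share < 1 and round(share, 3) == 1:
        return "> 99.9%"
    return f"{share * 100:.1f}%"


total = 101766  # number of observations in the dataset
# Counts from the miglitol "Common Values" table.
miglitol = {"No": 101728, "Steady": 31, "Down": 5, "Up": 2}
labels = {value: format_frequency(count, total) for value, count in miglitol.items()}
for value, label in labels.items():
    print(value, miglitol[value], label)
```

A nonzero share is never shown as "0.0%" under this rule, which keeps very rare categories visible in the table.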

Most occurring characters

Value             Count   Frequency (%)
o                 101733  50.0%
N                 101728  49.9%
S                 31      < 0.1%
t                 31      < 0.1%
e                 31      < 0.1%
a                 31      < 0.1%
d                 31      < 0.1%
y                 31      < 0.1%
D                 5       < 0.1%
w                 5       < 0.1%
Other values (3)  9       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203666  100.0%
Script    (unknown)  203666  100.0%
Block     (unknown)  203666  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

troglitazone
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0001179
Min length: 2

Characters and Unicode

Total characters: 203544
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101763  > 99.9%
Steady  3       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101763  50.0%
o      101763  50.0%
S      3       < 0.1%
t      3       < 0.1%
e      3       < 0.1%
a      3       < 0.1%
d      3       < 0.1%
y      3       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203544  100.0%
Script    (unknown)  203544  100.0%
Block     (unknown)  203544  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

tolazamide
Categorical

Imbalance 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0014936
Min length: 2

Characters and Unicode

Total characters: 203684
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101727  > 99.9%
Steady  38      < 0.1%
Up      1       < 0.1%
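
This variable reports three distinct values but only one unique value. The numbers are consistent with "Unique" counting values that occur in exactly one row, here "Up". A brief sketch of that reading, which is inferred from the counts rather than stated by the report:

```python
# Counts from the tolazamide "Common Values" table.
value_counts = {"No": 101727, "Steady": 38, "Up": 1}

n_rows = sum(value_counts.values())

# Distinct: how many different values appear at all.
n_distinct = len(value_counts)

# Unique: values that appear in exactly one row.
singletons = [value for value, count in value_counts.items() if count == 1]
n_unique = len(singletons)

print(n_rows, n_distinct, n_unique, singletons)
```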

Most occurring characters

Value  Count   Frequency (%)
N      101727  49.9%
o      101727  49.9%
S      38      < 0.1%
t      38      < 0.1%
e      38      < 0.1%
a      38      < 0.1%
d      38      < 0.1%
y      38      < 0.1%
U      1       < 0.1%
p      1       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203684  100.0%
Script    (unknown)  203684  100.0%
Block     (unknown)  203684  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

examide
Boolean

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB

Value  Count   Frequency (%)
False  101766  100.0%

citoglipton
Boolean

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB

Value  Count   Frequency (%)
False  101766  100.0%
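
Both examide and citoglipton carry the Constant alert: every row holds the same value, so neither column can distinguish one observation from another. A minimal sketch of detecting such columns, using only the standard library and a tiny hypothetical stand-in for the data (the real columns have 101766 rows each, and `constant_columns` is an illustrative helper, not part of the report):

```python
def constant_columns(columns):
    """Return the names of columns whose values are all identical."""
    return [name for name, values in columns.items() if len(set(values)) <= 1]


# Hypothetical four-row stand-in for three of the profiled columns.
columns = {
    "examide": [False, False, False, False],
    "citoglipton": [False, False, False, False],
    "diabetesMed": [True, False, True, True],
}

to_drop = constant_columns(columns)
kept = {name: values for name, values in columns.items() if name not in to_drop}
print("constant:", to_drop)
print("kept:", list(kept))
```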

insulin
Categorical

High correlation 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 MiB

Length

Max length: 6
Median length: 2
Mean length: 3.4526659
Min length: 2

Characters and Unicode

Total characters: 351364
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Up
3rd row: No
4th row: Up
5th row: Steady

Common Values

Value   Count  Frequency (%)
No      47383  46.6%
Steady  30849  30.3%
Down    12218  12.0%
Up      11316  11.1%

Most occurring characters

Value             Count  Frequency (%)
o                 59601  17.0%
N                 47383  13.5%
S                 30849  8.8%
t                 30849  8.8%
e                 30849  8.8%
a                 30849  8.8%
d                 30849  8.8%
y                 30849  8.8%
D                 12218  3.5%
w                 12218  3.5%
Other values (3)  34850  9.9%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  351364  100.0%
Script    (unknown)  351364  100.0%
Block     (unknown)  351364  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".
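
The character counts for insulin can be rebuilt from its value counts, since each row contributes every character of its label. A short sketch using `collections.Counter` (the variable names are illustrative):

```python
from collections import Counter

# Counts from the insulin "Common Values" table.
value_counts = {"No": 47383, "Steady": 30849, "Down": 12218, "Up": 11316}

char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count  # one occurrence of ch per row holding this value

total_chars = sum(char_counts.values())

# The report lists the ten most frequent characters and pools the rest.
top10 = char_counts.most_common(10)
other = total_chars - sum(count for _, count in top10)

for ch, count in top10:
    print(ch, count, f"{100 * count / total_chars:.1f}%")
print("Other values", other)
```

The letter "o" leads because it occurs in both "No" and "Down".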

glyburide-metformin
Categorical

Imbalance 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0273176
Min length: 2

Characters and Unicode

Total characters: 206312
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101060  99.3%
Steady  692     0.7%
Up      8       < 0.1%
Down    6       < 0.1%

Most occurring characters

Value             Count   Frequency (%)
o                 101066  49.0%
N                 101060  49.0%
S                 692     0.3%
t                 692     0.3%
e                 692     0.3%
a                 692     0.3%
d                 692     0.3%
y                 692     0.3%
U                 8       < 0.1%
p                 8       < 0.1%
Other values (3)  18      < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  206312  100.0%
Script    (unknown)  206312  100.0%
Block     (unknown)  206312  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

glipizide-metformin
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.000511
Min length: 2

Characters and Unicode

Total characters: 203584
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101753  > 99.9%
Steady  13      < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101753  50.0%
o      101753  50.0%
S      13      < 0.1%
t      13      < 0.1%
e      13      < 0.1%
a      13      < 0.1%
d      13      < 0.1%
y      13      < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203584  100.0%
Script    (unknown)  203584  100.0%
Block     (unknown)  203584  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

glimepiride-pioglitazone
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0000393
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101765  > 99.9%
Steady  1       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101765  50.0%
o      101765  50.0%
S      1       < 0.1%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203536  100.0%
Script    (unknown)  203536  100.0%
Block     (unknown)  203536  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

metformin-rosiglitazone
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0000786
Min length: 2

Characters and Unicode

Total characters: 203540
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101764  > 99.9%
Steady  2       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101764  50.0%
o      101764  50.0%
S      2       < 0.1%
t      2       < 0.1%
e      2       < 0.1%
a      2       < 0.1%
d      2       < 0.1%
y      2       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203540  100.0%
Script    (unknown)  203540  100.0%
Block     (unknown)  203540  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

metformin-pioglitazone
Categorical

High correlation  Imbalance 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 6
Median length: 2
Mean length: 2.0000393
Min length: 2

Characters and Unicode

Total characters: 203536
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: No
2nd row: No
3rd row: No
4th row: No
5th row: No

Common Values

Value   Count   Frequency (%)
No      101765  > 99.9%
Steady  1       < 0.1%

Most occurring characters

Value  Count   Frequency (%)
N      101765  50.0%
o      101765  50.0%
S      1       < 0.1%
t      1       < 0.1%
e      1       < 0.1%
a      1       < 0.1%
d      1       < 0.1%
y      1       < 0.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203536  100.0%
Script    (unknown)  203536  100.0%
Block     (unknown)  203536  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

change
Categorical

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 203532
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: No
2nd row: Ch
3rd row: No
4th row: Ch
5th row: Ch

Common Values

Value  Count  Frequency (%)
No     54755  53.8%
Ch     47011  46.2%

Most occurring characters

Value  Count  Frequency (%)
N      54755  26.9%
o      54755  26.9%
C      47011  23.1%
h      47011  23.1%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  203532  100.0%
Script    (unknown)  203532  100.0%
Block     (unknown)  203532  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".

diabetesMed
Boolean

High correlation 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 99.5 KiB

Value  Count  Frequency (%)
True   78363  77.0%
False  23403  23.0%

readmitted
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.0 MiB

Length

Max length: 3
Median length: 2
Mean length: 2.4608808
Min length: 2

Characters and Unicode

Total characters: 250434
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NO
2nd row: >30
3rd row: NO
4th row: NO
5th row: NO

Common Values

Value  Count  Frequency (%)
NO     54864  53.9%
>30    35545  34.9%
<30    11357  11.2%

Most occurring characters

Value  Count  Frequency (%)
N      54864  21.9%
O      54864  21.9%
3      46902  18.7%
0      46902  18.7%
>      35545  14.2%
<      11357  4.5%

Most occurring categories, scripts, and blocks

Grouping  Value      Count   Frequency (%)
Category  (unknown)  250434  100.0%
Script    (unknown)  250434  100.0%
Block     (unknown)  250434  100.0%

Most frequent character per category, script, and block

All characters fall in the single (unknown) group, so each of these tables repeats "Most occurring characters".
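
The report groups every character under (unknown), although readmitted mixes letters, digits, and comparison signs. As a sketch of the kind of property the Unicode Standard assigns, Python's standard `unicodedata` module can return the general category and name of each character. The module does not expose script or block, so only categories are shown, and nothing here is claimed about how the report derives its own tables.

```python
import unicodedata

# Distinct characters of the readmitted values "NO", ">30" and "<30".
chars = sorted(set("NO" + ">30" + "<30"))

# Two-letter general category codes, e.g. Lu = uppercase letter, Nd = decimal digit.
categories = {ch: unicodedata.category(ch) for ch in chars}

for ch, category in categories.items():
    print(repr(ch), category, unicodedata.name(ch))
```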

Interactions

2025-09-12T12:57:45.938520image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.425381image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:29.875454image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.387828image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:32.854825image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.360038image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:35.712604image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.260416image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.122677image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.456327image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:41.831681image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.502322image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:44.786673image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.038733image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.528831image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:29.976857image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.518275image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:32.957743image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.477479image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:35.811502image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.365501image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.221211image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.553748image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:41.952645image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.599862image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:44.896640image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.132693image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.644068image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:30.098506image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.629360image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:33.057570image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.577500image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:35.905050image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.479092image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.324357image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.644606image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:42.088432image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.691860image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:44.990011image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.223737image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.757315image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:30.250171image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.738636image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:33.166575image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.676071image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:36.016498image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.665456image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.432017image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.746502image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:42.215272image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.791613image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:45.092799image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.316881image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.854923image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:30.364084image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.834753image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:33.257530image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.768563image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:36.127705image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.820878image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.538703image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.839338image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:42.329161image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.883089image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:45.181385image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.399551image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:47.958167image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:30.522488image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:31.973657image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:33.470413image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:34.906751image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:36.282489image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:37.964955image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.644972image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:40.982693image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:42.505828image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:43.992209image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:45.284488image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.506756image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:48.061987image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:30.621670image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:32.075007image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:33.593255image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:35.010318image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:36.399337image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:38.121233image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:39.753712image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:41.096576image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:42.751557image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:44.085921image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:45.385836image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/
2025-09-12T12:57:46.607929image/svg+xmlMatplotlib v3.10.0, https://matplotlib.org/

Correlations

A1Cresult acarbose acetohexamide admission_source_id admission_type_id age change chlorpropamide diabetesMed discharge_disposition_id encounter_id gender glimepiride glimepiride-pioglitazone glipizide glipizide-metformin glyburide glyburide-metformin insulin max_glu_serum metformin metformin-pioglitazone metformin-rosiglitazone miglitol nateglinide num_lab_procedures num_medications num_procedures number_diagnoses number_emergency number_inpatient number_outpatient patient_nbr payer_code pioglitazone race readmitted repaglinide rosiglitazone time_in_hospital tolazamide tolbutamide troglitazone weight
A1Cresult 1.000 0.009 1.000 0.042 0.069 0.183 0.187 0.003 0.182 0.040 0.132 0.033 0.032 1.000 0.040 0.017 0.032 0.000 0.153 0.370 0.054 1.000 0.000 0.000 0.000 0.030 0.029 0.024 0.112 0.007 0.022 0.019 0.118 0.158 0.014 0.064 0.019 0.027 0.015 0.022 0.003 0.003 1.000 0.016
acarbose 0.009 1.000 0.000 0.000 0.000 0.002 0.046 0.000 0.030 0.000 0.006 0.007 0.010 0.000 0.022 0.000 0.007 0.004 0.011 0.016 0.013 0.000 0.000 0.001 0.000 0.000 0.013 0.004 0.000 0.000 0.007 0.000 0.011 0.010 0.007 0.007 0.012 0.012 0.002 0.007 0.000 0.000 0.000 0.000
acetohexamide 1.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.020 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.004 0.019 0.013 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.019 0.000 0.000 0.000 0.000
admission_source_id 0.042 0.000 0.000 1.000 -0.383 0.035 0.023 0.000 0.018 0.042 -0.051 0.012 0.016 0.000 0.007 0.000 0.019 0.017 0.041 0.148 0.029 0.000 0.000 0.000 0.007 0.136 -0.063 -0.205 0.106 0.104 0.056 0.024 0.030 0.081 0.017 0.074 0.056 0.018 0.021 0.003 0.000 0.004 0.000 0.031
admission_type_id 0.069 0.000 0.000 -0.383 1.000 0.038 0.063 0.002 0.043 0.021 -0.123 0.013 0.036 0.000 0.012 0.000 0.007 0.027 0.064 0.123 0.032 0.000 0.000 0.005 0.012 -0.224 0.087 0.217 -0.127 -0.033 -0.045 0.030 0.007 0.135 0.020 0.063 0.044 0.034 0.019 -0.015 0.006 0.013 0.000 0.043
age 0.183 0.002 0.000 0.035 0.038 1.000 0.056 0.003 0.044 0.060 0.037 0.078 0.024 0.000 0.037 0.000 0.050 0.010 0.068 0.133 0.066 0.000 0.000 0.004 0.008 0.023 0.060 0.065 0.131 0.027 0.049 0.004 0.039 0.153 0.030 0.085 0.038 0.029 0.026 0.043 0.000 0.014 0.000 0.026
change 0.187 0.046 0.000 0.023 0.063 0.056 1.000 0.012 0.506 0.081 0.120 0.014 0.144 0.000 0.209 0.007 0.191 0.043 0.641 0.248 0.329 0.000 0.000 0.014 0.055 0.070 0.244 0.027 0.057 0.015 0.017 0.015 0.130 0.148 0.203 0.021 0.046 0.078 0.196 0.115 0.000 0.000 0.003 0.048
chlorpropamide 0.003 0.000 0.000 0.000 0.002 0.003 0.012 1.000 0.015 0.019 0.014 0.000 0.000 0.000 0.002 0.000 0.000 0.000 0.010 0.000 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.006 0.000 0.000 0.000 0.006 0.003 0.000 0.003 0.004 0.000 0.000 0.003 0.000 0.000 0.000 0.000
diabetesMed 0.182 0.030 0.000 0.018 0.043 0.044 0.506 0.015 1.000 0.083 0.068 0.015 0.127 0.000 0.206 0.004 0.187 0.045 0.585 0.191 0.270 0.000 0.000 0.009 0.045 0.043 0.196 0.030 0.032 0.007 0.018 0.001 0.068 0.095 0.152 0.022 0.061 0.068 0.141 0.070 0.010 0.007 0.000 0.036
discharge_disposition_id 0.040 0.000 0.020 0.042 0.021 0.060 0.081 0.019 0.083 1.000 -0.065 0.027 0.022 0.000 0.028 0.010 0.051 0.015 0.078 0.071 0.036 0.000 0.000 0.004 0.006 0.059 0.171 0.013 0.151 0.007 0.085 0.033 -0.046 0.094 0.024 0.028 0.120 0.016 0.017 0.276 0.017 0.010 0.011 0.016
encounter_id 0.132 0.006 0.000 -0.051 -0.123 0.037 0.120 0.014 0.068 -0.065 1.000 0.011 0.030 0.001 0.023 0.004 0.054 0.029 0.102 0.177 0.028 0.014 0.011 0.005 0.022 -0.009 0.102 -0.031 0.293 0.131 0.037 0.151 0.544 0.244 0.036 0.078 0.073 0.019 0.044 -0.060 0.014 0.010 0.013 0.020
gender 0.033 0.007 0.000 0.012 0.013 0.078 0.014 0.000 0.015 0.027 0.011 1.000 0.000 0.000 0.019 0.005 0.023 0.000 0.000 0.000 0.000 0.000 0.002 0.004 0.000 0.017 0.036 0.045 0.000 0.000 0.008 0.000 0.022 0.061 0.004 0.054 0.013 0.000 0.011 0.028 0.003 0.000 0.004 0.027
glimepiride 0.032 0.010 0.000 0.016 0.036 0.024 0.144 0.000 0.127 0.022 0.030 0.000 1.000 0.000 0.042 0.000 0.040 0.004 0.010 0.000 0.028 0.000 0.000 0.013 0.009 0.019 0.029 0.007 0.010 0.009 0.000 0.000 0.023 0.033 0.026 0.014 0.007 0.003 0.025 0.025 0.000 0.000 0.005 0.007
glimepiride-pioglitazone 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.005 0.000 0.000 0.000 0.000 0.005 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
glipizide 0.040 0.022 0.000 0.007 0.012 0.037 0.209 0.002 0.206 0.028 0.023 0.019 0.042 0.000 1.000 0.000 0.062 0.015 0.034 0.054 0.049 0.000 0.000 0.014 0.007 0.024 0.042 0.009 0.013 0.000 0.012 0.000 0.021 0.017 0.029 0.014 0.015 0.010 0.027 0.037 0.000 0.002 0.000 0.013
glipizide-metformin 0.017 0.000 0.000 0.000 0.000 0.000 0.007 0.000 0.004 0.010 0.004 0.005 0.000 0.000 0.000 1.000 0.000 0.030 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.008 0.000 0.000 0.000 0.000 0.000 0.000 0.024 0.027 0.000 0.000 0.001 0.000 0.000 0.005 0.000 0.000 0.000 0.000
glyburide 0.032 0.007 0.000 0.019 0.007 0.050 0.191 0.000 0.187 0.051 0.054 0.023 0.040 0.000 0.062 0.000 1.000 0.004 0.054 0.032 0.093 0.000 0.000 0.000 0.011 0.020 0.030 0.007 0.024 0.000 0.020 0.005 0.044 0.040 0.016 0.017 0.004 0.014 0.025 0.033 0.000 0.000 0.000 0.004
glyburide-metformin 0.000 0.004 0.000 0.017 0.027 0.010 0.043 0.000 0.045 0.015 0.029 0.000 0.004 0.000 0.015 0.030 0.004 1.000 0.005 0.029 0.012 0.000 0.000 0.000 0.004 0.006 0.003 0.000 0.012 0.020 0.000 0.000 0.032 0.039 0.018 0.018 0.004 0.003 0.002 0.003 0.000 0.000 0.000 0.000
insulin 0.153 0.011 0.000 0.041 0.064 0.068 0.641 0.010 0.585 0.078 0.102 0.000 0.010 0.000 0.034 0.000 0.054 0.005 1.000 0.223 0.032 0.000 0.003 0.004 0.004 0.073 0.143 0.023 0.078 0.017 0.044 0.018 0.118 0.131 0.009 0.042 0.050 0.018 0.013 0.079 0.008 0.000 0.000 0.055
max_glu_serum 0.370 0.016 1.000 0.148 0.123 0.133 0.248 0.000 0.191 0.071 0.177 0.000 0.000 1.000 0.054 1.000 0.032 0.029 0.223 1.000 0.048 1.000 1.000 1.000 0.024 0.150 0.137 0.036 0.054 0.014 0.078 0.000 0.157 0.114 0.013 0.000 0.054 0.025 0.018 0.138 0.000 0.000 1.000 1.000
metformin 0.054 0.013 0.000 0.029 0.032 0.066 0.329 0.003 0.270 0.036 0.028 0.000 0.028 0.000 0.049 0.000 0.093 0.012 0.032 0.048 1.000 0.041 0.000 0.008 0.013 0.037 0.044 0.023 0.046 0.000 0.032 0.007 0.021 0.044 0.034 0.012 0.022 0.009 0.061 0.028 0.008 0.005 0.000 0.014
metformin-pioglitazone 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.014 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.041 1.000 0.000 0.000 0.000 0.001 0.000 0.000 0.005 0.000 0.000 0.000 0.000 0.015 0.010 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
metformin-rosiglitazone 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.011 0.002 0.000 0.000 0.000 0.000 0.000 0.000 0.003 1.000 0.000 0.000 1.000 0.000 0.000 0.000 0.038 0.006 0.000 0.000 0.000 0.000 0.023 0.015 0.000 0.028 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
miglitol 0.000 0.001 0.000 0.000 0.005 0.004 0.014 0.000 0.009 0.004 0.005 0.004 0.013 0.000 0.014 0.000 0.000 0.000 0.004 1.000 0.008 0.000 0.000 1.000 0.005 0.000 0.000 0.000 0.000 0.000 0.003 0.000 0.009 0.000 0.000 0.000 0.005 0.008 0.000 0.009 0.000 0.000 0.000 0.000
nateglinide 0.000 0.000 0.000 0.007 0.012 0.008 0.055 0.000 0.045 0.006 0.022 0.000 0.009 0.000 0.007 0.000 0.011 0.004 0.004 0.024 0.013 0.000 0.000 0.005 1.000 0.006 0.015 0.001 0.035 0.021 0.000 0.000 0.018 0.013 0.020 0.010 0.000 0.000 0.009 0.005 0.000 0.000 0.000 0.004
num_lab_procedures 0.030 0.000 0.004 0.136 -0.224 0.023 0.070 0.000 0.043 0.059 -0.009 0.017 0.019 0.000 0.024 0.008 0.020 0.006 0.073 0.150 0.037 0.001 0.000 0.000 0.006 1.000 0.252 0.023 0.169 0.006 0.041 -0.024 0.027 0.047 0.018 0.041 0.032 0.020 0.011 0.337 0.000 0.006 0.000 0.038
num_medications 0.029 0.013 0.019 -0.063 0.087 0.060 0.244 0.000 0.196 0.171 0.102 0.036 0.029 0.000 0.042 0.000 0.030 0.003 0.143 0.137 0.044 0.000 0.038 0.000 0.015 0.252 1.000 0.352 0.294 0.044 0.099 0.074 0.045 0.038 0.043 0.030 0.063 0.016 0.032 0.465 0.000 0.000 0.000 0.008
num_procedures 0.024 0.004 0.013 -0.205 0.217 0.065 0.027 0.003 0.030 0.013 -0.031 0.045 0.007 0.000 0.009 0.000 0.007 0.000 0.023 0.036 0.023 0.000 0.006 0.000 0.001 0.023 0.352 1.000 0.067 -0.046 -0.064 -0.024 -0.019 0.043 0.010 0.025 0.037 0.000 0.008 0.187 0.007 0.000 0.000 0.011
number_diagnoses 0.112 0.000 0.000 0.106 -0.127 0.131 0.057 0.006 0.032 0.151 0.293 0.000 0.010 0.005 0.013 0.000 0.024 0.012 0.078 0.054 0.046 0.005 0.000 0.000 0.035 0.169 0.294 0.067 1.000 0.092 0.136 0.113 0.240 0.079 0.010 0.063 0.082 0.022 0.008 0.237 0.009 0.000 0.000 0.022
number_emergency 0.007 0.000 0.000 0.104 -0.033 0.027 0.015 0.000 0.007 0.007 0.131 0.000 0.009 0.000 0.000 0.000 0.000 0.020 0.017 0.014 0.000 0.000 0.000 0.000 0.021 0.006 0.044 -0.046 0.092 1.000 0.222 0.177 0.113 0.034 0.000 0.004 0.029 0.000 0.000 -0.001 0.000 0.000 0.000 0.000
number_inpatient 0.022 0.007 0.000 0.056 -0.045 0.049 0.017 0.000 0.018 0.085 0.037 0.008 0.000 0.000 0.012 0.000 0.020 0.000 0.044 0.078 0.032 0.000 0.000 0.003 0.000 0.041 0.099 -0.064 0.136 0.222 1.000 0.156 0.026 0.029 0.011 0.014 0.130 0.000 0.008 0.092 0.000 0.000 0.000 0.014
number_outpatient 0.019 0.000 0.000 0.024 0.030 0.004 0.015 0.000 0.001 0.033 0.151 0.000 0.000 0.000 0.000 0.000 0.005 0.000 0.018 0.000 0.007 0.000 0.000 0.000 0.000 -0.024 0.074 -0.024 0.113 0.177 0.156 1.000 0.155 0.024 0.000 0.012 0.028 0.000 0.000 -0.013 0.000 0.000 0.000 0.019
patient_nbr 0.118 0.011 0.000 0.030 0.007 0.039 0.130 0.006 0.068 -0.046 0.544 0.022 0.023 0.000 0.021 0.024 0.044 0.032 0.118 0.157 0.021 0.000 0.023 0.009 0.018 0.027 0.045 -0.019 0.240 0.113 0.026 0.155 1.000 0.174 0.032 0.106 0.115 0.042 0.019 -0.017 0.009 0.000 0.000 0.037
payer_code 0.158 0.010 0.000 0.081 0.135 0.153 0.148 0.003 0.095 0.094 0.244 0.061 0.033 0.005 0.017 0.027 0.040 0.039 0.131 0.114 0.044 0.015 0.015 0.000 0.013 0.047 0.038 0.043 0.079 0.034 0.029 0.024 0.174 1.000 0.031 0.087 0.049 0.025 0.015 0.033 0.000 0.000 0.000 0.054
pioglitazone 0.014 0.007 0.000 0.017 0.020 0.030 0.203 0.000 0.152 0.024 0.036 0.004 0.026 0.000 0.029 0.000 0.016 0.018 0.009 0.013 0.034 0.010 0.000 0.000 0.020 0.018 0.043 0.010 0.010 0.000 0.011 0.000 0.032 0.031 1.000 0.015 0.011 0.015 0.037 0.023 0.000 0.000 0.000 0.019
race 0.064 0.007 0.000 0.074 0.063 0.085 0.021 0.003 0.022 0.028 0.078 0.054 0.014 0.000 0.014 0.000 0.017 0.018 0.042 0.000 0.012 0.000 0.028 0.000 0.010 0.041 0.030 0.025 0.063 0.004 0.014 0.012 0.106 0.087 0.015 1.000 0.037 0.016 0.006 0.013 0.000 0.000 0.000 0.036
readmitted 0.019 0.012 0.000 0.056 0.044 0.038 0.046 0.004 0.061 0.120 0.073 0.013 0.007 0.000 0.015 0.001 0.004 0.004 0.050 0.054 0.022 0.000 0.000 0.005 0.000 0.032 0.063 0.037 0.082 0.029 0.130 0.028 0.115 0.049 0.011 0.037 1.000 0.016 0.013 0.048 0.002 0.000 0.000 0.035
repaglinide 0.027 0.012 0.000 0.018 0.034 0.029 0.078 0.000 0.068 0.016 0.019 0.000 0.003 0.000 0.010 0.000 0.014 0.003 0.018 0.025 0.009 0.000 0.000 0.008 0.000 0.020 0.016 0.000 0.022 0.000 0.000 0.000 0.042 0.025 0.015 0.016 0.016 1.000 0.006 0.024 0.000 0.000 0.000 0.000
rosiglitazone 0.015 0.002 0.000 0.021 0.019 0.026 0.196 0.000 0.141 0.017 0.044 0.011 0.025 0.000 0.027 0.000 0.025 0.002 0.013 0.018 0.061 0.000 0.000 0.000 0.009 0.011 0.032 0.008 0.008 0.000 0.008 0.000 0.019 0.015 0.037 0.006 0.013 0.006 1.000 0.021 0.000 0.000 0.003 0.004
time_in_hospital 0.022 0.007 0.019 0.003 -0.015 0.043 0.115 0.003 0.070 0.276 -0.060 0.028 0.025 0.000 0.037 0.005 0.033 0.003 0.079 0.138 0.028 0.000 0.000 0.009 0.005 0.337 0.465 0.187 0.237 -0.001 0.092 -0.013 -0.017 0.033 0.023 0.013 0.048 0.024 0.021 1.000 0.000 0.000 0.013 0.010
tolazamide 0.003 0.000 0.000 0.000 0.006 0.000 0.000 0.000 0.010 0.017 0.014 0.003 0.000 0.000 0.000 0.000 0.000 0.000 0.008 0.000 0.008 0.000 0.000 0.000 0.000 0.000 0.000 0.007 0.009 0.000 0.000 0.000 0.009 0.000 0.000 0.000 0.002 0.000 0.000 0.000 1.000 0.000 0.000 0.000
tolbutamide 0.003 0.000 0.000 0.004 0.013 0.014 0.000 0.000 0.007 0.010 0.010 0.000 0.000 0.000 0.002 0.000 0.000 0.000 0.000 0.000 0.005 0.000 0.000 0.000 0.000 0.006 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000
troglitazone 1.000 0.000 0.000 0.000 0.000 0.000 0.003 0.000 0.000 0.011 0.013 0.004 0.005 0.000 0.000 0.000 0.000 0.000 0.000 1.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.003 0.013 0.000 0.000 1.000 0.000
weight 0.016 0.000 0.000 0.031 0.043 0.026 0.048 0.000 0.036 0.016 0.020 0.027 0.007 0.000 0.013 0.000 0.004 0.000 0.055 1.000 0.014 0.000 0.000 0.000 0.004 0.038 0.008 0.011 0.022 0.000 0.014 0.019 0.037 0.054 0.019 0.036 0.035 0.000 0.004 0.010 0.000 0.000 0.000 1.000
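
One way to read the correlation table programmatically is to list the variable pairs whose coefficient passes a cutoff. The following is a minimal sketch, not the report's own alerting logic: the function name `high_pairs` and the 0.5 cutoff are assumptions chosen for illustration, and `CORR_EXCERPT` holds a four-variable excerpt of values copied from the table.

```python
from itertools import combinations

# Four-variable excerpt of the correlation table (values copied from the report).
CORR_EXCERPT = {
    "change":      {"change": 1.000, "diabetesMed": 0.506, "insulin": 0.641, "metformin": 0.329},
    "diabetesMed": {"change": 0.506, "diabetesMed": 1.000, "insulin": 0.585, "metformin": 0.270},
    "insulin":     {"change": 0.641, "diabetesMed": 0.585, "insulin": 1.000, "metformin": 0.032},
    "metformin":   {"change": 0.329, "diabetesMed": 0.270, "insulin": 0.032, "metformin": 1.000},
}


def high_pairs(corr, threshold=0.5):
    """Return (var_a, var_b, value) for each unordered pair with |value| >= threshold.

    The 0.5 default is an assumed cutoff for this sketch, not a value stated in the report.
    """
    pairs = []
    for a, b in combinations(sorted(corr), 2):  # each unordered pair once, diagonal skipped
        value = corr[a][b]
        if abs(value) >= threshold:
            pairs.append((a, b, value))
    # Strongest association first.
    return sorted(pairs, key=lambda pair: -abs(pair[2]))


for a, b, value in high_pairs(CORR_EXCERPT):
    print(f"{a} ~ {b}: {value:.3f}")
```

On this excerpt the three pairs among change, diabetesMed and insulin pass the cutoff, which is consistent with the High correlation alerts listed in the Overview.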

Missing values

A simple visualization of nullity by column.
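
As a rough illustration of what a per-column nullity count involves, the sketch below counts missing cells in a list of records using only the standard library. The helper names `is_missing` and `missing_counts` are hypothetical. The three records reuse the weight, max_glu_serum and A1Cresult values of Sample rows 0, 1 and 101761. Treating the `?` token as missing is an assumption exposed through the `placeholders` argument, since the report does not say how `?` is handled.

```python
import math


def is_missing(value, placeholders=()):
    """True for None, float NaN, or any caller-supplied placeholder token."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in placeholders


def missing_counts(rows, placeholders=()):
    """Count missing cells per column for a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + int(is_missing(value, placeholders))
    return counts


# Values taken from three rows of the Sample section (rows 0, 1 and 101761).
SAMPLE_ROWS = [
    {"weight": "?", "max_glu_serum": float("nan"), "A1Cresult": float("nan")},
    {"weight": "?", "max_glu_serum": float("nan"), "A1Cresult": float("nan")},
    {"weight": "?", "max_glu_serum": float("nan"), "A1Cresult": ">8"},
]

print(missing_counts(SAMPLE_ROWS))                       # NaN only
print(missing_counts(SAMPLE_ROWS, placeholders=("?",)))  # NaN plus the "?" token (assumption)
```

Passing `placeholders=("?",)` changes the weight count from 0 to 3 on these rows, which shows how much the choice of missing-value markers can affect a nullity chart.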
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
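
Nullity correlation can be sketched as the Pearson correlation between the 0/1 missingness indicators of two columns. The code below is an illustrative standard-library version, not the report's implementation: `nullity_correlation`, `TOY_ROWS` and the column names `lab_a`, `lab_b`, `lab_c` and `id` are hypothetical. With pandas, a comparable matrix is commonly obtained with `df.isna().corr()`.

```python
import math


def _is_null(value):
    """True for None or float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the 0/1 missingness indicators of two columns.

    Returns None when either column is never or always missing, because the
    indicator then has zero variance and the correlation is undefined.
    """
    a = [1.0 if _is_null(row[col_a]) else 0.0 for row in rows]
    b = [1.0 if _is_null(row[col_b]) else 0.0 for row in rows]
    n = len(rows)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Toy records (hypothetical): lab_a and lab_b are missing together,
# lab_c is missing exactly when they are present, id is never missing.
TOY_ROWS = [
    {"lab_a": None, "lab_b": None, "lab_c": 7.1, "id": 1},
    {"lab_a": None, "lab_b": None, "lab_c": 6.4, "id": 2},
    {"lab_a": 5.0, "lab_b": 110, "lab_c": None, "id": 3},
    {"lab_a": 6.2, "lab_b": 95, "lab_c": None, "id": 4},
]

print(nullity_correlation(TOY_ROWS, "lab_a", "lab_b"))  # missing together
print(nullity_correlation(TOY_ROWS, "lab_a", "lab_c"))  # one present when the other is missing
print(nullity_correlation(TOY_ROWS, "lab_a", "id"))     # undefined: id is never missing
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and fully populated columns have no defined nullity correlation.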

Sample

encounter_id patient_nbr race gender age weight admission_type_id discharge_disposition_id admission_source_id time_in_hospital payer_code medical_specialty num_lab_procedures num_procedures num_medications number_outpatient number_emergency number_inpatient diag_1 diag_2 diag_3 number_diagnoses max_glu_serum A1Cresult metformin repaglinide nateglinide chlorpropamide glimepiride acetohexamide glipizide glyburide tolbutamide pioglitazone rosiglitazone acarbose miglitol troglitazone tolazamide examide citoglipton insulin glyburide-metformin glipizide-metformin glimepiride-pioglitazone metformin-rosiglitazone metformin-pioglitazone change diabetesMed readmitted
022783928222157CaucasianFemale[0-10)?62511?Pediatrics-Endocrinology4101000250.83??1NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNO
114919055629189CaucasianFemale[10-20)?1173??59018000276250.012559NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYes>30
26441086047875AfricanAmericanFemale[20-30)?1172??11513201648250V276NaNNaNNoNoNoNoNoNoSteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoYesNO
350036482442376CaucasianMale[30-40)?1172??441160008250.434037NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
41668042519267CaucasianMale[40-50)?1171??51080001971572505NaNNaNNoNoNoNoNoNoSteadyNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoChYesNO
53575482637451CaucasianMale[50-60)?2123??316160004144112509NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYes>30
65584284259809CaucasianMale[60-70)?3124??70121000414411V457NaNNaNSteadyNoNoNoSteadyNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoChYesNO
763768114882984CaucasianMale[70-80)?1175??730120004284922508NaNNaNNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoYes>30
81252248330783CaucasianFemale[80-90)?21413??68228000398427388NaNNaNNoNoNoNoNoNoSteadyNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoChYesNO
91573863555939CaucasianFemale[90-100)?33412?InternalMedicine333180004341984868NaNNaNNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoSteadyNoNoNoNoNoChYesNO
encounter_id patient_nbr race gender age weight admission_type_id discharge_disposition_id admission_source_id time_in_hospital payer_code medical_specialty num_lab_procedures num_procedures num_medications number_outpatient number_emergency number_inpatient diag_1 diag_2 diag_3 number_diagnoses max_glu_serum A1Cresult metformin repaglinide nateglinide chlorpropamide glimepiride acetohexamide glipizide glyburide tolbutamide pioglitazone rosiglitazone acarbose miglitol troglitazone tolazamide examide citoglipton insulin glyburide-metformin glipizide-metformin glimepiride-pioglitazone metformin-rosiglitazone metformin-pioglitazone change diabetesMed readmitted
101756443842070140199494OtherFemale[60-70)?1172MD?466171119965854039NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYes>30
101757443842136181593374CaucasianFemale[70-80)?1175??211160014915185119NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYesNO
101758443842340120975314CaucasianFemale[80-90)?1175MC?7612201029283049NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
10175944384277886472243CaucasianMale[80-90)?1171MC?10153004357842507NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
10176044384717650375628AfricanAmericanFemale[60-70)?1176DM?451253123454384129NaNNaNNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoDownNoNoNoNoNoChYes>30
101761443847548100162476AfricanAmericanMale[70-80)?1373MC?51016000250.132914589NaN>8SteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoDownNoNoNoNoNoChYes>30
10176244384778274694222AfricanAmericanFemale[80-90)?1455MC?333180015602767879NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoSteadyNoNoNoNoNoNoYesNO
10176344385414841088789CaucasianMale[70-80)?1171MC?53091003859029613NaNNaNSteadyNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoDownNoNoNoNoNoChYesNO
10176444385716631693671CaucasianFemale[80-90)?23710MCSurgery-General452210019962859989NaNNaNNoNoNoNoNoNoSteadyNoNoSteadyNoNoNoNoNoNoNoUpNoNoNoNoNoChYesNO
101765443867222175429310CaucasianMale[70-80)?1176??13330005305307879NaNNaNNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNoNO